Avoiding Duplicated Computation to Improve the Performance of PFSP on CUDA GPUs

Authors

  • Chao-Chin Wu
  • Kai-Cheng Wei
  • Wei-Shen Lai
  • Yun-Ju Li
Abstract

Graphics Processing Units (GPUs) have emerged as powerful parallel compute platforms for various application domains. A GPU consists of hundreds or even thousands of processor cores and adopts the Single Instruction, Multiple Threads (SIMT) architecture. Previously, we proposed an approach that optimizes the Tabu Search algorithm for solving the Permutation Flowshop Scheduling Problem (PFSP) on a GPU by using a mathematical function to generate all the different permutations, avoiding the need to place all the permutations in global memory. Building on that result, this paper proposes another approach that further improves performance by avoiding duplicated computation among threads, which is incurred whenever two permutations share the same prefix. Experimental results show that the GPU implementation of our proposed Tabu Search for PFSP runs up to 1.5 times faster than another GPU implementation proposed by Czapiński and Barnes.
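
The two ideas in the abstract can be illustrated with a small CPU-side sketch in Python. It is not the authors' CUDA implementation, and it rests on assumptions the abstract does not confirm: the neighbourhood is taken to be pairwise swaps, the hypothetical function index_to_swap stands in for the "mathematical function" that turns a thread index into a permutation, and the prefix reuse is shown by caching completion times of the current permutation. All function names are illustrative.

```python
def index_to_swap(k, n):
    """Map a linear index k (e.g. a thread id) to a swap (i, j) with i < j.

    Assumption: the neighbourhood is all n*(n-1)/2 pairwise swaps, so no
    permutation has to be stored in memory; it is derived from k instead.
    """
    i = 0
    row = n - 1  # number of pairs whose first position is i
    while k >= row:
        k -= row
        i += 1
        row -= 1
    return i, i + 1 + k


def append_job(prev_col, job_times):
    """Completion times on every machine after scheduling one more job."""
    col = []
    done = 0  # completion time of this job on the previous machine
    for machine, t in enumerate(job_times):
        done = max(done, prev_col[machine]) + t
        col.append(done)
    return col


def prefix_columns(perm, p):
    """cols[k] holds the completion times after the first k jobs of perm."""
    cols = [[0] * len(p[0])]
    for job in perm:
        cols.append(append_job(cols[-1], p[job]))
    return cols


def swap_makespan(perm, p, cols, i, j):
    """Makespan of perm with positions i < j swapped.

    Positions before i are identical to the current permutation, so their
    completion times are read from cols instead of being recomputed.
    """
    neighbor = list(perm)
    neighbor[i], neighbor[j] = neighbor[j], neighbor[i]
    col = cols[i]
    for job in neighbor[i:]:
        col = append_job(col, p[job])
    return col[-1]


def best_swap(perm, p):
    """Evaluate every swap neighbour and return (makespan, (i, j))."""
    n = len(perm)
    cols = prefix_columns(perm, p)  # shared by all neighbours
    moves = (index_to_swap(k, n) for k in range(n * (n - 1) // 2))
    return min((swap_makespan(perm, p, cols, i, j), (i, j)) for i, j in moves)
```

Here p[job][machine] is the processing time of a job on a machine. On a GPU, each thread would presumably call the equivalent of index_to_swap with its own thread index and read the shared prefix data rather than rebuild it; how the paper actually partitions and stores that data is not stated in the abstract.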

Similar resources

An approach to Improve Particle Swarm Optimization Algorithm Using CUDA

The time consumed in solving computationally heavy problems has always been a concern for computer programmers. Due to the simplicity of its implementation, PSO (Particle Swarm Optimization) is a suitable meta-heuristic algorithm for solving computationally heavy problems. However, despite the simplicity, the algorithm is inefficient for solving real computationally heavy problems but the pr...

Accelerating high-order WENO schemes using two heterogeneous GPUs

A double-GPU code is developed to accelerate WENO schemes. The test problem is a compressible viscous flow. The convective terms are discretized using third- to ninth-order WENO schemes and the viscous terms are discretized by the standard fourth-order central scheme. The code, written in the CUDA programming language, is developed by modifying a single-GPU code. The OpenMP library is used for parall...

MPI- and CUDA- implementations of modal finite difference method for P-SV wave propagation modeling

Among different discretization approaches, the Finite Difference Method (FDM) is widely used for acoustic and elastic full-waveform modeling. An inevitable drawback of the technique, however, is its severe demand for computational resources. A promising solution is parallelization, where the problem is broken into several segments, and the calculations are distributed over different processors. ...

GPU Coprocessing

Site-specific modeling of wireless communications channels has historically been too computationally intensive to incorporate into commodity network simulators. Simulation cannot accurately predict the behavior of wireless networks in real-world environments without modeling the physical channel realistically. Realistic models typically involve large amounts of floating point computation, to wh...

Isolated Persian/Arabic handwriting characters: Derivative projection profile features, implemented on GPUs

For many years, researchers have studied high-accuracy methods for recognizing handwriting and have achieved many significant improvements. However, an issue that has rarely been studied is the speed of these methods. Considering computer hardware limitations, it is necessary for these methods to run at high speed. One of the methods to increase the processing speed is to use the computer pa...

Publication date: 2016